Lecture 10 – March 2 Lecturer : Prof Lillian Lee Scribes : Jerzy Hausknecht & Kent Sutherland More on Language Models

Author

Lillian Lee

Abstract

In the previous lecture, we discussed relevance models as presented in [Lavrenko & Croft 01]. For each query, a language model for relevance is constructed; the final product is a language model estimated from a collection of documents. The final estimation details were very similar to query likelihood, even though the relevance model was derived from the ideas in [Robertson & Spärck Jones 76]. The relevance model work reinforces the “importance of being reversed” (Lafferty and Zhai’s phrasing): assigning likelihood to a query from a document-based model yields better statistical estimates than assigning likelihood to a document from a query-based model. While such an approach to deriving language models “works” mathematically, the intuition behind it severely stretches the original assumptions made in the Robertson & Spärck Jones approach. This discussion starts with a mostly clean slate and attempts to justify the language model approach to query likelihood in a more intuitive fashion.
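As a rough illustration of the query-likelihood direction described above, the sketch below scores a query under a unigram language model built from each document and ranks documents by that score. The toy documents, the function name `query_log_likelihood`, and the use of add-one smoothing are all assumptions made for this example; they are not taken from the lecture.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, vocab_size):
    """Return log P(query | unigram model of doc).

    Add-one smoothing is an assumption of this sketch; it only keeps
    unseen query terms from driving the probability to zero.
    """
    counts = Counter(doc)
    total = len(doc)
    return sum(
        math.log((counts[term] + 1) / (total + vocab_size))
        for term in query
    )

# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "language models for information retrieval".split(),
    "d2": "singular value decomposition of a term document matrix".split(),
}
vocab_size = len({term for doc in docs.values() for term in doc})
query = "language models".split()

# Score the query under each document-based model, then rank documents.
scores = {name: query_log_likelihood(query, doc, vocab_size)
          for name, doc in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Here the model is estimated from the document, which supplies many more tokens than a short query would, and the query is the text being scored.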

Similar resources

Distinguished Lecturer Tour of Norman C. Beaulieu in Skopje, Florence, Podgorica, and Belgrade in March 2014 [Global Communications Newsletter]

The idea for Prof. Norman Beaulieu’s DL Tour was raised during the Globecom 2013 conference in December 2013 in Atlanta, when Prof. Beaulieu and Prof. Zoran HadziVelkov, the Chair of the R. Macedonia ComSoc Chapter, first discussed it. The Distinguished Lecturer Tour was organized at the beginning of 2014 to include four different European countries: Republic of Macedonia, Italy, Montenegro, an...

CS 630 Notes : Lecture 5 Lecturer

1 Review of Classic Probabilistic Retrieval Model Previously we modeled the problem of retrieval as follows: we will try to calculate the probability P (relevant|doc) given a fixed query q. We then proceeded with the following steps. 1. To make sense of the original proposition, we converted doc d to an attribute vector, where attributes are “kind of” based on terms: doc d→ ~a (d) = (a1 (d) , ....

The Singular Value Decomposition 4 / 4 / 06 Lecturer : Lillian

In today’s lecture we will finally state the Singular Value Decomposition ‘theorem’. To build some intuition for it, we will continue exploring the underlying geometric interpretations of Matrix Theoretic Corpus Characterizations. We wish to develop a general way of succinctly describing properties inherent in some corpora using matrices. Since we already know of a vector space representation for...

CS 6740 : Advanced Language Technologies February 4 , 2010 Lecture 3 : Pivoted Document Length Normalization

In this lecture, we examine the impact of the length of a document on its relevance to queries. We show that document relevance is positively correlated with document length, and see that relevance scores that use the normalization techniques we’ve studied so far (L∞, L1, L2) do not capture this correlation correctly. Finally, we present the “pivoted document length normalization” technique int...

Lectures 7, 8 and 9: October 11, 13 and 18, 1999 Lecturer: Mona Singh Scribes: Ching Law and Casim A. Sarkar

Publication date: 2010
